Draft version March 25, 2008 

Preprint typeset using I£T^X style cmulateapj v. 10/09/06 



HALO ASSEMBLY BIAS IN HIERARCHICAL STRUCTURE FORMATION 
Neal Dalal 1 , Martin White 2 , J. Richard Bond 1 and Alexander Shirokov 1 

Draft version March 25, 2008 

ABSTRACT 

We investigate the origin of halo assembly bias, the dependence of halo clustering on assembly 
history. We relate halo assembly to peak properties measured in the Lagrangian space of the initial 
linear Gaussian random density held, and show how these same Lagrangian properties determine 
large-scale bias. We focus on the two regimes where assembly bias has been observed to be significant: 
at masses very large and very small compared to the nonlinear mass scale. At high masses, we show 
that assembly bias is expected from the statistics of the peaks of Gaussian random fluctuations, and 
we show that the extent of assembly bias found in N-body simulations of rare halos is in excellent 
agreement with our theoretical prediction. At low masses, we argue that assembly bias largely arises 
from a sub-population of low mass halos whose mass accretion has ceased. Due to their arrested 
development, these halos naturally become unbiased, in contrast to their anti-biased peers. We show 
that a simple toy model incorporating these effects can roughly reproduce the bias trends found in 
N-body simulations. 

Subject headings: cosmology:theory - methods:numerical - dark matter: merging histories - galaxies: 
clusters 



1. INTRODUCTION 

Hierarchical structure formation is a basic prediction 
of the highly successful cold dark matter cosmological 
model, and a key element of hierarchical growth is the 
formation of bound virialized objects called dark mat- 
ter halos. Dark matter halos are not distributed uni- 
formly throughout the universe, but rather compose a 
beaded filamenta ry network, some times referred to as 
the cosmic web (jBond et alJl!996t ). The clustering of 
the rare, massive dark matter halos is enhanced rel- 
ative to the general mass distribut i on (iKaiserl [ 1984; 
Efstathiou et all Il98& ICole fc Kaiserl 119891: iBond et al.l 
1991k iMo fe Whitdll996l: ISheth fe Tormenlll999D . an ef- 
fect known as bias. The more massive the halo, the 
larger the bias. As a result, the mass of halos hosting 
a given population of objects is sometimes inferred by 
measuring their degree of clustering - allowing a statis- 
tical route to the notoriously difficult problem of mea- 
suring masses of cosmological objects. There are many 
applications for statistical mass determination, includ- 
ing ( for example) cluster mass-observa ble self-calibration 
(e.g.Oma & Hu 200l lWu et aT1l200l) or reconstruction 
of th e halo occupation distr ibution of galaxies or quasars 
(e.g. iCoorav fc Sheth|[200l . 

Since halos of a given mass can differ in their forma- 
tion history, properties and large-scale environment 3 , a 
natural question arises: do these details affect halo clus- 
tering? Numerical simulations are now able to produce 
samples with sufficient statistics to test for the depen- 
dence of clustering on history, properties and environ- 
ment and recent numerical simulations reveal subtle de- 
pendences of halo clusteri ng on formation redshift, in- 
tcrnal structure and spin (jOao. Springel fc Whitel [20051 : 
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3 The large-scale environment of a halo refers to the density, 
smoothed on some suitably large scale, e.g. 8h _1 Mpc. 



I Wechsler et alJl200felHarker et al.ll200llBett et al.ll2007t 

Wetzel et al. 2007; Jing. Suto fc Mo 2007: iGao fc Whitel 
120071 : lAngulo. Baugh fc Lacevll2008l ) 

Usin g a marked correlation function, ISheth fc Tormenl 
(|2004f ) found that close halo pairs tend to have earlier 
formation times than more dista nt pairs, wor k whic h 
was extended and confirmed by lHarker et al.l {2006). 
IGao. Springel fc Whitel (|2005f) found that later forming, 
low-mass halos are less cluste red than typical halos o f the 
same mass a t the p resent. IWechsler et al.l (|2006l ) and 
IWetzel et al.l (|2007l ) found a similar dependence upon 
halo formation time (defined in a slightly different way), 
showing that the trend reversed for more massive halos 
and that the clustering depend ed on halo concentration. 
Recently, IGao fc Whitel (|2007l ) showed that a wide vari- 
ety of halo properties correlate with bias, inclu ding spin 
and substructure content, while ILT et alJ (|2008h showed 
that different definitions of halo age, when applied to the 
same halo samples, give different amounts of assembly 
bias. 

What remains unclear from this numerical work is 
the ph ysical origin of this effect. IGao. Springel fc White 
(l2005h argued that assem bly bias is inconsistent with the 
iPress fc Schechterl (|1974l ) model, which explicitly pre- 
dicts that halo abundance (and hence clustering) de- 
pends only upon halo mass and no other internal prop- 
erties related to assembly history. Attempts to explain 
assembly bias based on the Press-Schechter approach 
(i.e. the excursion set using a sharp fc-space filter) would 
therefore appear doomed from the outset, however sev- 
eral workers have explored modifications to the Press- 
Schechter inojM_in_the hopes of accounting for assem- 
bly bias. ISandvik et al.l (|2007l ) considered excursion set 
mass functions using barrier heights determined fro m the 
ellipsoidal collapse model of IBond fc Mversl (|1996| ). but 
were unable to account for the magnitu de of assembly 
bias seen in simulations. IMo et a l. (2005) suggested that 
"pre-heating" , arising from the partial collapse of struc- 
ture into sheets and filaments, could suppress mass accre- 
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tion onto halos within those sheets and filaments. This 
mechanism would clearly be incapable of explaining as- 
sembly bias in high-mass halos, w hich collapse befo re the 
formation of sheets an d filaments (|Bond et al.ll 1996h , and 
ISandvik et al.l (|2007f ) found that this mechan ism fails 
in low-mass halos as well. iDesiacquesI (120071) investi- 
gated the effects of environment on the iBond &: Mversl 
(1996) ellipsoidal collapse model, but could not repro- 
duce the sign reversal of assembly bias between high and 
low masses, or the magnitude of assembly bias seen in 
N-body simulations. 

In this paper, we will argue that many aspects of as- 
sembly bias may be understood quite simply. In par- 
ticular, we show that the statistics of Gaussian random 
fields require significant assembly bias in high-mass ha- 
los, and we show using N-body simulations that the clus- 
tering of massive halos matches precisely the assembly 
bias that we predict. At low halo masses, we argue that 
assembly bias arises largely due to the cessation of mass 
accretion onto a sub-population of small halos. These 
non-accreting halos then naturally become unbiased over 
time, and therefore cluster more strongly than the ma- 
jority of halos of similar mass, which are anti-biased. 

2. BIAS 

In this section, we briefly review previous results on 
biasing and assembly bias. 

We start by assuming that massive dark matter ha- 
los arise from the collapse of peaks in the initial (lin- 
ear) density field; this allows us to infer many of the 
properties of halo statistics using the statistics of gaus- 
sian fields (IPress k Schechterlll974t iBardeen et al.l ll986l: 
IBond et al l 119911 : IBond fc Mversl I1996D . High peaks of 
the density field will in general cluster more strongly than 
the average matter clustering, which leads to halo bias. 
In the linear bias model, we directly relate the halo fluc- 
tuations to the local matter density fluctuations, 



becomes 



S h = b5, 



(1) 



where S = 5p/p is the overdensity and b is the bias pa- 
rameter. On scales over which this expression is valid 4 , 
we may express halo two-point functions in terms of the 
matter two-point function, scaled by appropriate powers 
of the bias. For example, the halo power spectrum would 
become Ph(k) = b 2 P m (k). 

The clustering of halos reflects the clustering of ini- 
tial peaks only in part, however. Another contribution 
to halo clustering comes from the motion of peaks along 
large-scale matter flows. At late times, the halo overden- 
sity becomes 

S h (a) =6 h + S m (a) (2) 

where the Lagrangian overdensity 5l corresponds to the 
clustering of the peaks in the initial density field, and 
the second term 5 m reflects the motion of the peaks due 
to advection along matter flows. On sufficiently large 
scales, in which we can treat the motion of the peaks 
as tracing the average bulk motion, the second term is 
simply the matter overdensity. Then the (Eulerian) bias 

4 This approximation should be valid on scales where the matter 
corre lation is small, £(r) < 1 or A 2 (k) = k 3 P(k)/2ir 2 -C 1 IIKaiserl 
1984). In general the scale-dependence of the bias will extend to 
larger scales the more biased the halos. 



b(a) = 5 h {a)/8 m (a) 
= l + b h (a). 
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Note that we explicitly include the time dependence of 
the matter overdensity, but that the Lagrangian over- 
density of peaks is not time dependent: it simply reflects 
the clustering of peaks in the initial conditions. There- 
fore, over time, the clustering associated with motion will 
dominate over the clustering from the initial conditions. 
That is, if we can treat the halos as test particles, we ex- 
pect their Lagrangian bias to decay with time inver sely 
with the linear growth factor, bh(a) oc 1/D(a) (e.g. iFrvl 
[l^lTegmark fc PeeblesHl998l ). 

Henceforth, we will focus on the Lagrangian bias &l, 
noting that the conversion to Eulerian bias b is simply 
given by Eqn. on linear scales. The next step is to 
determine &l- From Eqn. (|TJ) , we can think of bias as 
the rate at which the number of halos or peaks changes 
as the background density is varied. If we associate dark 
matter halos with rare peaks above some high threshold 
density, then it becomes easy to see why massive halos 
are biased. In the Press-Schechter model, for example, 
the (Lagrangian) number density of halos is 



nps oc exp 
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and changing the background density level by S is equiv- 
alent to changing the threshold density to 5 C — S. Then 
the bias becomes 
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()Cole fc Kaiserl fl989l) . This precise expression applies 
only for the Press-Schechter mass function, but its qual- 
itative features are generally valid. At high masses 
(er <c (5 C ), peaks above threshold are much more clustered 
than matter, because changing the background density 
level dramatically changes the probability out on the tail 
of the Gaussian distribution. In contrast, at low masses 
(a ^> <5 C ), halos are anti-biased (&l < 0) because rais- 
ing the background density level reduces the number of 
peaks at low mass, by converting them to higher mass. 

Clearly, many aspects of the mass dependence of bias 
may be inferred by consideration of the initial density 
peaks from which halos form. We can largely understand 
the clustering of Eulerian halos found in complex nonlin- 
ear simulations by adopting a Lagrangian viewpoint, and 
examining the properties of peaks of the initial linear 
density field. As mentioned in the introduction, simu- 
lations have shown that bias depends upon additional 
halo properties besides mass, such as assembly history 
or structural parameters like concentration. The philos- 
ophy of this paper will be to assume that this additional 
dependence may also be understood from the statistics of 
the initial Gaussian peaks. Accordingly, we will study N- 
body halos found in simulations described below, using 
a Lagrangian viewpoint. 

3. SIMULATION AND POST-PROCESSING 

We have performed N-body simulations of structure 
formation in scale-free cosmologies, with Q m = 1 and 
power-law power spectra. We have chosen to run scale- 
free models rather than more realistic ACDM models in 
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order to simplify the problem as much as possible, and 
remove possible effects from late-time cosmic accelera- 
tion or a varying spectral index. On scales far from the 
box size and far from the Nyquist frequency, the only 
relevant scale in the problem is M*(a), the characteristic 
nonlinear mass scale satisfying a(M±,a) = 6 C = 1.68. If 
the power spectrum slope is n, such that P(k) oc k n , then 
the rms mass fluctuations scale like a(M) oc M~( 3+ ™" 6 . 
Because the linear growth factor behaves as D(a) = a 
for this critical-density cosmology, the nonlinear mass 
scales like M+(a) oc a 6 ^ 3+n K Since the nonlinear mass 
varies rapidly with expansion factor a, by examining halo 
properties at multiple redshifts w e can study halo prop- 
erties over a wide ran ge of M/M* (|Efstathiou et alj|1988t 
iWechsler et aDl2006h . 

In this paper, we will focus on the largest simulation we 
ran, with 1024 3 particles, using a power spectral index 
n = — 2. We normalized the power spectrum such that 
a(M,a) = WaM- 1 / 6 , where the mass M is measured in 
number of particles. The simu lation was evolved using 
the a daptive P 3 M code gracos (IShirokov fc Bertsc hingcr 
2005), using a Plummer softening length ep =0.1 times 
the mean interparticle spacing. The simulation was 
started at a = 0.005, and 100 outputs were generated, 
spaced evenly in log a from a = 0.0825 to a = 1, con- 
taining phase-space coordinates (comoving positions and 
peculiar velocities) and ID's for each particle. 

The phase space data for each output is used to find 
the (sub-)halos in a two-step process. First we generate a 
catalo g of halos using th e Friends-of-Friends (FoF) algo- 
rithm ( Davis et al.lll985T ) with a linking length of 6 = 0.2 
times the mean inter-particle spacing. This procedure 
partitions the particles into equivalence classes by linking 
together all particles separated by less than a distance b, 
with a density of roug hly p > 3/(2?r& 3 ) ~ 100 times the 
background density. We keep all halos with more than 
32 particles. 

We identify subhalos of the FOF grou ps using a new 
implem entation of the SubRnd method of lSpringel et all 
(2001) which defines subhalos as gravitationally self- 
bound aggregations of particles bounded by a density 
saddle point. After some experimentation with different 
techniques we found this method gave a good match to 
what would be selected "by eye" as subhalos. We use 
a spline kernel with 16 neighbors to estimate the den- 
sity and keep all subhalos with more than 20 particles. 
We call the most massive subhalo in any FoF group the 
central halo. The other subhalos we refer to as satellites. 

For each subhalo we compute and store a number of 
properties including the bound mass, velocity dispersion, 
peak circular velocity (v c ), total potential energy, posi- 
tion and velocity. Since mass loss can be quite extreme 
for some subhalos we will use v c rather than mass to 
quote subhalo 'size'. The mass, peak circular velocity 
and potential are highly correlated, however since our 
subhalos are not spherical, there is a non-trivial scatter 
between v c and mass at fixed time. 

A merger tree is computed from the set of subhalo 
catalogs by identifying for each subhalo a "child" at a 
later time. We process 4 consecutive simulation outputs 
at a time and for each subhalo in the earliest output 
we define its child as the subhalo at a later time which 
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where a\ and M\ are the scale-factor and mass of the pro- 
genitor, a,2 and M 2 are the candidate halo at the later 
time, 4>u is the potential of particle i computed using 
the particles in subhalo 1 and the sum is over all par- 
ticles in the progenitor that also lie in the candidate. 
The weighting by (j> 2 ensures that the "core" particles 
in the progenitor are given more weight than the less 
bound particles. We found that weighting by </> 2 gave 
better results than weighting by </>, but higher powers of 
4> did not perform appreciably better than <f> 2 . We find 
~ 95% of the subhalos have a child in the next time step, 
with one or two percent skipping one or more output 
times. A sim ilar method for subha l o tracking has been 
emplo y ed by ISpringel et"ail (I2005D. iFaltenbacher et al.l 
(l2005lf.lKravtsov et all (|2004l ). lAllgood et alJ (|2006l ) . and 
lHarker et alJ ([2001 . 

In addition to the Eulerian halo properties mentioned 
above, we also require Lagrangian properties of halos, 
such as their centroids or mean initial velocities. From 
the initial conditions for the simulation, we have the ini- 
tial density field S(x), the initial comoving displacement 
field d(x) satisfying 5 + V • d = 0, the peculiar velocity 
field, which in the Zeldovich approximation for fi TO = 1 
is v = aH d, and the strain field S = Vd. Note that 
Tr S = —S. These fields are all computed on the same 
grid as the initial (undisplaced) particle positions, by 
Fourier transform of the initial density field. For exam- 
ple, the strain field Sij is computed by FFT of Skkikj/k 2 . 
We would like to average these quantities over the La- 
grangian volumes occupied by the halos. We may do so 
using the unique ID's for each particle. For each field F, 
we compute the average (F) over every halo by a simple 
average over the halo's particles: 
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(F) i = Nr^F(x itj ), 



(7) 



where Xij is the Lagrangian position of the j parti- 
cle of halo i. Lastly, we will also require derivatives 
of these smoothed properties with respect to mass, e.g. 
d(S) / d(\og M) . We use the merger tree described above 
to compute derivatives with respect to mass. For ev- 
ery halo, we move along the most massive branch of its 
merger tree for a factor of 2 in mass, tabulating mass 
M and the average (F). We then fit a linear expansion, 
F = Fq + F' A(log M) in order to measure the derivative 
F' = dF/dlogM. 

Lastly, we will determine bias parameters in Fourier 
space. We measure the matter power spectrum P(k), 
the halo auto-power spectrum Ph(k), and the halo- 
matter cross spectrum P c (k), and define the halo bias as 
b = P c (k)/P(k). We use the ratio of the cross-spectrum 
rather than the halo auto-spectrum, because the former 
is less sensitive to shot noise in the halo counts. 

4. HIGH MASS HALOS: M > AL 

We first focus on halos well above the nonlinear mass 
scale, M 3> M*, where a(M) <C S c . These halos are par- 
ticularly simple : because they are so massive, they dom- 
inate their environments, and because they are far out on 
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Fig. 1. — Average Lagrangian overdensity (8) for halos as a 
function of a(M). In the high mass limit, a — » 0, collapse becomes 
spherical and (S) — > S c = 1.68. Point colors correspond to peak 
curvature, d(8) /d log M , as denoted in the colorbar on the right. 
The solid black curve and error bars depict the mean and standard 
deviation of 8 in bins of a. 

the tail of the probability distribution of 5, the mass de- 
pendence of their bias scales like b ~ S c /a 2 . Anti-biasing 
from larger-scale halos should be negligible. Another key 
simplification in this regime is that the collapse of these 
rare peaks is nearly spherical (jBardeen et al.|[l986t Ap- 
pendix C). Because the collapse is roughly spherical, we 
expect the coll apse threshold predi cted by the spherical 
collapse model (|Gunn k, Gott|l972l ) to be a good approx- 
imation: S c ps 1.68. This expectation is borne out by our 
N-body results. In figure [U we plot the average linearly 
evolved overdensity for FOF halos well above M*, and 
indeed we find (5) ~ 1.7 for the > 3a peaks. Note that 
multiple different redshifts and halo masses are plotted 
in this figure. 

The validity of 5 C ps 1.68 for halos in the regime 
a < 1 allows us easily to estimate the halo bias as 
b ~ 8 c /a 2 as mentioned above. This is the average 
over all peaks of a given mass, which marginalizes over 
any other peak parameters which could affect bias. One 
obvious additional parameter which the bias must de- 
pend upon is the curvature of the peak, which we will 
parametrize by s = d{5) / dilog M) . To see this, note 
that we can estimate the large-scale background den- 
sity Sf, in which the halos reside by a Taylor expansion: 
5 b ps 6 + A(logM) x dS/d(\ogM) + . . .. Two peaks of 
the same mass M and peak height 5 but different peak 
curvatures si > S2 (i.e. \si\ < |sa|) will tend to live in 
different environments: peak 1 will tend to have a larger 
background density than peak 2, and therefore will have 
a larger bias. It is straightforward to estimate the depen- 
dence of bias on curvature. Writing v = 5 /a, x = s/a s , 
and the cross-correlation coef ficient 7 = (vx), the bias 
becomes (jBardeen et al.lll986l ) 5 



We can easily sketch a d erivation of th is expression fol- 
lowing the argument of Kaiserl (|1984h . Let us com- 
pute the correlation function for regions above a high 
threshold v t . Consider regions 1 and 2, each of size R 
and separated by distance r 12 , with peak heights V\,vi 

5 IBa rdccn et al. (1986) parametrized curvature not with 
dS/dlogM as we do, but rather with the closely related V 2 <5. 
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Fig. 2. — Eulerian bias b E = 1 + fe L for halos with 110-130 par- 
ticles at a = 0.1089, as a function of peak curvature. These halos 
have a = 0.49, and are roughly ~ 1600A/* at this redshift. Halos 
are binned into quartiles of dlog 8/d log M, and halo bias in each 
bin is measured by the ratio of the halo-matter cross-spectrum to 
the matter power spectrum. The solid black curve shows Eqn. JSJ. 



and curvatures x\,X2- These Gaussian fields are de- 
scribed by a covariance matrix with diagonal elements 
(v 2 ) = (x 2 ) = 1. Of the off-diagonal elements, only 
{V1V2) = ^(712) and (viXi) = (^2^2) = 7 are important; 
long-range correlations involving derivatives of the den- 
sity field fall off more quickly than these terms by powers 
of R/r 12. We compute the two-point correlation of re- 
gions with v > vt by integrating the Gaussian probability 
over V\ > vt, v-i > v t . If we are uninterested in the cur- 
vature x, then we first marginalize over X\ and X2, leav- 
ing a 2 x 2 covariance matrix with off-diagonal element 
ip, and we find that the two-point function for regions 
above threshold becomes £ D k = v 2 ^, giving a Lagrangian 
bias &l = vt/cr ()Kaiserlll984[ ) . We can similarly compute 
the two-point function for regions above threshold with 
a specified curvature x by repeating this calculation, but 
not marginalizing over x\ and X2- Recalling that the 
one-point probability distribution for v at a given x is a 
Gaussian centered at (v\x) — jx with variance 1 — 7 2 , 
we change variables from v to v' = (y — jx)/ (1 — 7 2 ) 1 / 2 . 
Noting that (v^v^) ~ — j 2 ), we immediately see 

that the two-point correlation function for regions above 
threshold becomes £ p k = {v' t ) 2 ijj/(l ~~ 7 2 ), giving a bias 

6 L = ^'/(av/1-7 2 ), i.e. Eqn. ®. 

The bias of halos found in our simulation matches 
Eqn. {8} quite well. For a power spectrum index n = —2, 
as we have used, note that <r s = <r/\/6 and 7 = — l/v6, 
for a top-hat window function. In figure [2] we plot the 
halo bias as a function of peak curvature, along with our 
prediction from Eqn. (8|). There is evidently a strong 
dependence of bias upon peak curvature. The ~ 40% 
range in b corresponds to a factor of 2 variation in the 
correlation function for halos of this mass. 

One subtlety which we have glossed over in computing 
the statistics of high-mass peaks is the issue of the ap- 
propriate smoothing to use. We have assumed a top-hat 
smoothing, however there is no compelling theoretical 
reason to favor a top-hat kernel versus other possibili- 
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Fig. 3. — Probability for a particle to be linked into a high mass 
FOF group as a function of initial Lagrangian radius. The thick 
solid black curve shows the stacked Lagrangian volumes for halos 
in the mass range 200-300 particles at a = 0.1089, corresponding 
to roughly ~ 3000M* . Note that mass in subhalos is excluded from 
this average. The blue dotted curve shows the expected average 
profile for a tophat window, while the red dashed curve corresponds 
to a Gaussian of this mass range. The thin solid black curves show 
the average Lagrangian profiles for halos of mass between 200-300 
particles at later outputs, a = 0.1896,0.3300,0.5547, and 1.0000, 
corresponding to M/M+ ~ 100,4,0.2 and 0.005 respectively. 

ties, such as a Gaussian (although the sharp fc-space filter 
implicit in the Press-Schechter model appears unphysi- 
cal). As a sanity check on our top-hat smoothing, we 
have computed the average Lagrangian volumes corre- 
sponding to high peaks, by stacking roughly 1000 halos 
of mass ~ 3000 M*. The results are shown in figure [3l 
The average Lagrangian volume occupied by FOF halos 
is reasonably compact, with the probability for halo in- 
clusion steeply varying from near 1 to near over a range 
0.8 < 7'th < 1-2, where tth is the top-hat smoothing ra- 
dius for mass M. In comparison, a Gaussian smoothing 
window is clearly more diffuse. We conclude that our 
use of top-hat smoothing is reasonable, though we cau- 
tion that these results may depend upon the slope of the 
power spectrum. Finally, we note that there is a nontriv- 
ial amount of mass inside r < tth that does not end up 
in the halo; we will return to this point below in section 

El 

4.1. Halo observables 

We have shown that there is a strong dependence of 
halo clustering upon peak curvature. How does this re- 
late to the assembly bias observed in previous simula- 
tions? 

In the high mass regime, there is a direct relationship 
between peak curvature and halo assembly history. Re- 
call that in the spherical collapse model, regions com- 
plete their collapse once the average linearly evolved 
overdensity 8 reaches S c . This implies that we can es- 
timate the assembly history of halos simply by mea- 
suring how the smoothed overdensity (S) varies with 
smoothing scale. We would expect the accretion rate 
and the curvature to be related by d(log M) / <f (log a) — 
— [d (log <5)/d(log A/)] -1 . The average mass accretion his- 
tories for high mass halos appears consistent with this ex- 
pectation. In figure [4] we plot the stacked mass accretion 
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Fig. 4. — Mass accretion histories as a function of peak cur- 
vature. The upper solid (blue) curve shows the average accre- 
tion history for the halos comprising the leftmost bin in Fig. [2] 
while the lower solid (red) curve shows the average history for the 
rightmost bin. The dotted curves depict the behavior M oc a a 
for a = — l/(d log 8/d log M ) expected from the spherical collapse 
model. 

history profiles for halos of 120 particles at a = 0.1089, 
splitting the population based on peak curvature. The 
average accretion rates are in good agreement with our 
expectations. This is true only in the average, however 
: individual accretion histories are noisy because much 
of the accretion onto halos is not smooth, but rather 
consists of discrete accretion events. This may explain 
why different age ind icators for halos gi ve different lev- 
els of assembly bias (| Wetzel et al.ll2007[ ). The noisiness 
of commonly used age indicators may be large enough 
to mask the strong assembly bias clearly present in high 
mass halos. 

Another hal o parameter seen to c orrelate with bias is 
concentration (|Wechsler et al.l 12006). While we cannot 
write down an expression directly relating peak curva- 
ture and halo concentration, nevertheless we would ex- 
pect curvature and concentration to be correlated. Pre- 
vious work has shown t hat assembly history and concen- 
tration are correlated (jWechsler et al.l 120021 : IZhao et all 
l2003al |bT). so the relation between peak curvature and 
accretion rate discussed above suggests that concentra- 
tion and peak curvature may be correlated. Physically, 
we would expect that highly curved peaks should pro- 
duce more concentrated halos. To test this, we plot in 
figure [5] the stacked radial profiles of halos with 4000- 
5000 particles at a = 0.2501, as a function of Lagrangian 
peak curvature. There is a correlation between initial 
peak curvature and final halo concentration apparent in 
the stacked profiles, although it is quite apparent that 
peak curvature is not equivalent to concentration. While 
the inner profiles (at r < r s ) of curvature-selected halos 
match well with profiles of concentration-selected halos, 
the outer profiles (at r > r s ) begin to diverge. 

Within the statistical errors of our finite sample of rare 
halos, we have found no residual dependence of bias on 
concentration, once we account for the peak curvature 
dependence described above. Our simulation had rela- 
tively low force resolution, and so our fitted concentra- 
tions may be sufficiently noisy to mask any underlying 
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Fig. 5. — Radial profiles for halos as a function of peak curvature. 
The various curves show the stacked radial profiles for subsets of 
halos of 4000-5000 particles at a = 0.2501, roughly M ~ 400Af*. 
The solid curves correspond to the upper and lower quartiles of 
d log 5/d log M, while the dashed curves correspond to the upper 
and lower quartiles of fitted concentration C2oo- For comparison, 
the vertical dotted black line shows r = 2.8ep, the softening length. 



residual concentration dependence of bias, but it appears 
that assembly bias in rare, high mass halos is in agree- 
ment with theoretical expectations. The interpretation 
of this effect is straightforward: halos residing in high 
density environments have an enhanced mass supply, and 
therefore grow faster, than halos in low density environ- 
ments. 

5. LOW MASS HALOS: M < M* 

In this section we consider assembly bias for low mass 
halos with M < M*, for which <x(M) > S c . The age- 
and concentration-dependence of bias at low masses is 
considerably stronger than in the high mass regime, and 
of the opposite sign: the oldest, highly concentrated 
halos have a bias more than twice as large as that of 
the young est, low-concentration halos of similar mass 
M «: dGao. Springel fc Whitd l2CM lWechsler"eTaT] 
l2TM iGao & Whitell2007<r ~ 

In agreement with previous workers, we also find this 
reversed sense of assembly bias. In keeping with our La- 
grangian viewpoint, we use mean Lagrangian overdensity 
(S) as a proxy for age. In figure El we plot average ac- 
cretion histories for halos of 200 particles at a — 1, split 
into quartiles of (S) . The oldest halos show little growth 
at late times, accreting only ~ 20% of their mass af- 
ter a = 0.5. In comparison, the nonlinear mass scale 
increases by a factor w 64 over this same time. 

Because the oldest halos do not grow appreciably in 
mass, we can think of their dynamics as those of test 
particles following large-scale bulk flows. As discussed in 
section [2 we therefore expect the bias of this population 
to relax over time to b — » 1. This is indeed the behav- 
ior observed for our oldest halos, as shown in figure El 
which plots the (Eulerian) bias of the same halos shown 
in figure El The oldest ~ 20% of halos are nearly un- 
biased, while the remainder are strongly anti-biased, to 
a degree consistent with predictions of ev en the simplest 
Press-Schechter model, b — » l — S^ 1 « 0.4 (Co le &: Kaiser! 




Fig. 6. — Mass accretion histories for halos of 190-210 particles 
at a = 1, corresponding to M = 0.0045M*. The halos are split into 
quartiles of (8) , which correlates well with mass accretion history. 




Fig. 7. — Eulerian bias for the same halos of ~ 200 particles 
shown in Figure [6] The oldest halos, at high (<5), are nearly unbi- 
ased while the rest of the population is strongly antibiased. Remov- 
ing all halos which previously were subhalos, and all halos bound to 
more massive halos, removes the old unbiased subpopulation (red 
dashed curve). 

[Mi). 

The magnitude of assembly bias present at low masses 
is therefore not surprising. Most low-mass halos are anti- 
biased at expected levels, while a subpopulation becomes 
unbiased. The theoretical challenge is to explain the 
number of unbiased, low-mass halos. 

One hint towards the origin of the non-accreting, unbi- 
ased population comes from the observation that many, 
if not most of t hese halos are as sociated with nearby 
larger halos (e.g. lWang et al.ll2007f) . In figure[71 we again 
plot the bias of ~ 200 particle halos, but excluding those 
which in earlier outputs were subhalos. This removes 
~ 20 — 25% of the total population. Because the time 
sampling of our simulation outputs do not finely cover 
the dynamical times of all halos, this cut could miss sub- 
halos with orbital pericen ters inside of l arger halos, but 
apocenters well outside. iLudlow et alj (|2008t ) have re- 
cently argued that a significant fraction of subhalos have 
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Fig. 8. — Probability distribution for ambient 3-D velocity dis- 
persion <r within a radius 5r200 of halos with 200-300 particles at 
a = 1, roughly 0.005 M*. Halos are split into 5 bins of (5), with 
red, magenta, green, cyan, and blue from lowest to highest respec- 
tively. The oldest halos tend to live in hot environments, whereas 
the material in the vicinity of most low-mass halos is sufficiently 
cold to permit accretion. 

orbital apocenters extending out to many times the virial 
radius of their host halo. To account for this, we also 
conservatively exclude all halos which are within 3 r v j r 
and gravitationally bound to large r halos. This will ex - 
clude the population discussed by iLudlow et alj (2008), 
as well as any halos entering the infall region of massive 
objects. This cut removes a further ~ 15% of low mass 
halos, leaving > 60% of the original low-mass popula- 
tion. The effect of these cuts is effectively to exclude the 
old subpopulation, even though the cuts do not explic- 
itly refer to halo age or accretion history. It therefore 
does seem likely that, as suggested by previous authors, 
the old subpopulation of low-mass halos corresponds to 
early-form ers whose growth w as stunted by environmen- 
tal effects (|Wang et al.ll2007t) . 

One plausible environmental effect that would account 
for the arrested development of the oldest low-mass halos 
is simply the enhanced velocity dispersion in the vicinity 
of massive halos. In environments where the ambient 
velocity dispersion greatly exceeds the escape velocity 
of low-mass halos, those halos will be unable to accrete 
significant material. We indeed find that the oldest halos 
preferentially reside within hot environments, as shown 
in Fig. E 

Given that the oldest low-mass halos tend to exist in 
proximity to larger halos, we would expect them to orig- 
inate (in Lagrangian space) from the vicinities of high 
mass halos as well, c.f. Fig. [31 As discussed in the previ- 
ous section, much of the material within the Lagrangian 
volumes of high mass halos ends up within the final col- 
lapsed halo, so how do we understand the survival of 
the old low-mass halos in the presence of their rapidly 
accreting neighbors ? One possible explanation noted by 
ILudlow et alJ (|2008l ) is that these halos are ejected out of 
larger halos by 3-body interactions. Because our goal is 
to understand the clustering of halos in terms of their La- 
grangian properties of known (Gaussian) statistics, and 
since complicated multi-body interactions are clearly dif- 
ficult to model using only Lagrangian quantities, we have 
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Fig. 9. — Binding energy versus initial radius for peaks that 
are accreted (red), that become subhalos (green), or that escape 
(blue). 

not attempted to address this mechanism using our sim- 
ulations. Rather, we have tried to find Lagrangian prop- 
erties of halos which correlate with their survival around 
larger structures. 

One Lagrangian quantity which correlates with sur- 
vival is plotted in Figure [9] We have taken halos of 
mass 40,000-60,000 particles at a = 1, nearly M ~ M*, 
and identified smaller-scale peaks within the spherical 
Lagrangian volumes centered on the larger halos. As 
shown in the figure, the relative binding energy of the 
subpeaks roughly correlates with their escape likelihood. 
We have defined the binding energy SE in the following 
manner. The gravitational potential near a peak may be 
written 



(x) w 0o + x ■ V(p + \^x ■ VV0 • x - 



(9) 



The first two terms do not affect the collapse of the peak, 
and so we define the perturbed potential energy as the 
quadratic term, 



1 



x) = —x ■ VV0 • x 



4a 



(10) 



where we have used the Poisson equation to relate the 
strain tensor S = VV(V~ 2 <5) to the potential's curvature 
tensor. We can similarly write the kinetic energy near the 
peak as 



T(X): 



:-\v\ 

2 1 1 



-\aH(x 
2 1 V 



:-{aH\x\f 



d) + Vj 
2(aH) 2 x d 



(11) 



where we have used the Zeldovich approximation for 
tt m = 1 to relate the peculiar velocity and comoving 
displacement, v p = aHd. The unperturbed piece again 
does not affect the peak dynamics, and so we write the 
perturbed kinetic energy as 



ST = 2(aH) 2 x ■ d, 



(12) 



and the total perturbed energy becomes SE = 5(f) + ST. 
In computing 5(f), we smooth the strain tensor S over the 
large halo, whereas in ST we use the difference in the 
mean Lagrangian peculiar velocities, Aul = «L2 — i>li- 

As can be seen from Figure |H1 the initial Lagrangian 
binding energy does not solely determine the survival of 
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subpeaks; the escape threshold is radially dependent as 
well. Without any physical justification, we find that an 
approximate threshold 

E' = 6E + (aH) 2 [2r 2 - r s 2 m ] = (13) 

roughly separates accreted subpeaks from escaping sub- 
peaks. 

Our assumption, therefore, is that subpeaks inside of 
larger peaks can escape accretion onto their hosts if the 
relative energy is large enough, E' > 0. Due to environ- 
mental effects, however, they are unable to grow, and 
thereby produce the population of old, unbiased low- 
mass halos. In the next section, we test how well this 
simple picture can reproduce our numerical bias results. 

6. A TOY MODEL FOR ASSEMBLY BIAS 

In the previous two sections, we have discussed assem- 
bly bias in two asymptotic regimes, a <C 1 and a ^> 6 C . 
We found that two distinct effects generated assembly 
bias in these two regimes. At intermediate masses, a ~ 1, 
both effects can be significant, and the overall bias of a 
sample of halos will involve a competition between these 
two effects. The transition between the high-mass behav- 
ior and the low mass behavior will depend in detail upon 
the precise selection criteria used to define the sample of 
halos. For example, we plot in Figure [10] the halo bias for 
halos selected on peak height (5) and on peak curvature 
d\og(S) / dlog M. Recall that these peak properties both 
affect halo properties like age and concentration. At high 
masses, the scatter in peak height among peaks above 
threshold is small, and so the scatter in peak curvature 
dominates the assembly bias. At low mass, the peak 
height is the dominant factor determining halo age or 
concentration, causing a reversal in the sense of assembly 
bias. To illustrate, we plot in the right panel of Fig. [TU] 
the bias for halos selected on (6)/S c —d\og(5)/d\ogM. At 
high masses, the curvature term dominates, while at low 
masses, the first term dominates. Halo samples selected 
on formation time or concentration show similar behav- 
ior. At intermediate masses, both of these effects are 
significant and compete with each other, and the magni- 
tude and sign of assembly bias will depend in detail upon 
the precise age indicator that is employed. For example, 
an age indicator that correlates more strongly with peak 
height rather than peak curvature will show a crossover 
at higher masses than the example shown in Fig. 110b . 
This may explain why different measures of halo age give 
different amounts of as sembly bias w hen applied to the 
same samples of halos t|Li et al.ll2008|) . 

Using our results on the assembly bias of low- and 
high-mass halos, we have constructed a simple toy model 
for halo biasing. We generate a Gaussian random den- 
sity field, smooth the field on multiple scales, and start- 
ing from the largest smoothing scales, search for den- 
sity peaks exceeding the ellipsoidal collapse threshold 
(|Bond & Mverslll996l . hereafter BM96). Isolated peaks 
above threshold are labeled as collapsed halos, whereas 
peaks falling within 1.2 smoothing radii (c.f. Fig. [9|) of 
larger scale collapsed regions are only labeled as collapsed 
if their binding energies relative to their larger hosts ex- 
ceeds the threshold found empirically from our N-body 
halos, i.e. Eqn. (fl3|) . 

A key component of this calculation is the collapse 
threshold. A comparison of the simple BM96 model 



with our N-body results shows that the model works 
well at predicting collapse thresholds for reasonably high 
peaks, a < 1, however at low peaks ir> 1, the model 
begins to break down. Very roughly, the ellipsoidal 
collapse threshold is larger than the spherical collapse 
threshold by a ratio S CC /5 SC ~ (1 — ^scCu) -1 : where 
the ellipticity e v of the strain tensor is given by the 
difference between its largest and smallest eigenvalues, 
e v = (S33 — Sn)/2d. The ellipticity and prolaticity are 
random variables wh ose probability distr ibution is re- 
lated to peak height (jBardeen et all 1 19861 BM96), but 
very roughly the mean ellipticity scales as (e v ) ~ l/{2v). 
For v < 1, the BM96 collapse threshold quickly grows, 
and for sufficiently large e v and — p v the model predicts 
that collapse never occurs. However we find many low 
mass halos with average peak heights (5) falling far be- 
low the collapse thresholds expected given their average 
Lagrangian strain tensors (S). This is not surprising. As 
noted explicitly by BM96, their homogeneous ellipsoid 
model assumes that the evolution of the external tidal 
field acting upon the collapsing peak may be determined 
from the strain tensor (S) averaged over the local volume 
of the peak. While it is certainly true that the exter- 
nal tidal field and local strain tensor are closely related 
at early times, in the nonlinear regime the one-to-one 
correspondence between them breaks down. On scales 
where a ^> 1, the external tidal field evolves nonlinear ly 
in a manner that cannot be predicted from purely lo- 
cal measurements over the peak volume. Therefore, we 
would not expect the simple BM96 model to apply in 
this regime, as those authors note. 

Nevertheless, we require some choice of collapse thresh- 
old for our toy model. At low masses, the lowest (5)'s 
are roughly ~ 2.5 (see Fig. [7]), so we arbitrarily set a 
threshold S c = min((5BM96j 2.5). In addition, we also re- 
quire that subpeaks collapsing inside of larger peaks must 
complete their collapse sufficiently before the larger peak 
collapses. We can estimate the collapse redshift z c by 
1 + z c = (S)/S c . Writing z s and Zb as the collapse red- 
shifts for the sub-peak and large peak, respectively, we 
require (1 + z s ) > 2(1 + Zj,) for a subpeak to be labeled 
as a collapsed halo. 

Figure QT] shows results of this simple calculation. In 
order to compare with Fig. [LQ\ we plot bias as a function 
of the same peak parameters. At high masses, a « 1, 
the bias matches Eqn. as it must. At low masses (a ^> 
1) the bias of isolated peaks asymptotes to &e ~ 0.4, 
while the subpeaks become unbiased, &e — ► 1- These 
features appear fairly generic. Behavior in the transition 
region, a ~ 1 — 2, depends upon details of the model. In 
particular, the bias of the old halos in the intermediate 
mass regime depends upon the fraction of halos that are 
subpeaks, and varying parameters in the model changes 
this fraction. 

Overall, however, we find reasonable agreement be- 
tween the toy model and our N-body simulation in the 
regimes where assembly bias is significant, for M 3> M* 
and M <C M*. The point of this exercise is not to 
claim that we have solved the halo formation problem, 
but merely to illustrate that the assembly bias in these 
regimes is generic. The assembly bias at high masses is 
an unavoidable consequence of Gaussian statistics, and 
therefore any halo formation model which relates ini- 
tial density peaks to halos will also find similar levels 
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Fig. 10. — Bias as a function of peak parameters. Bias is plotted for halos in the range 200-300 particles at various redshifts. In the left 
panel, the red and blue curves show bias for the upper and lower 20% of peak height (S)/S c , where 8 C is the ellipsoidal collapse threshold 
calculated from each halo's local strain tensor. In the middle panel, we split halos on peak curvature, d log(5)/d log M. In the right panel, 
we split halos on an average of these two quantities, (S)/8 C — d log(5)/d log M. At high masses, the scatter in curvature dominates, while 
at low masses, the scatter in peak height dominates, leading to a transition in assembly bias at intermediate masses. Formation time or 
concentration shows similar behavior. 
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Fig. 11. — Similar to Fig. 1101 but for peaks found in our toy model calculation rather than N-body halos. 



of assembly bias, depending somewhat on the peak def- 
inition. The behavior at low masses may appear more 
model dependent, however we do not believe our results 
are specific to our particular mechanism for production of 
non-accreting halos. Rather, we expect that any mecha- 
nism which produces roughly the correct fraction of non- 
growing, low-mass halos will also match the assembly 
bias found in N-body simulations. 

7. DISCUSSION & CONCLUSIONS 

We have investigated halo assembly bias in hierarchical 
structure formation. We find that assembly bias is sig- 
nificant both at high masses (M ^> M+) and low masses 
(M <C M*). Two different competing effects generate as- 
sembly bias in these two regimes. At high masses, peaks 
of low curvature are more strongly clustered than peaks 
of large curvature, in a manner entirely consistent with 
the statistics of peaks of random Gaussian fields. We 
provide a simple expression for the bias in the high-mass 
regime which accounts for assembly bias and reduces to 
the Press-Schechter bias when marginalized over all pa- 
rameters excluding mass (note that fitting formulae for 
bia s calibrated at lower m ass, such as those p r opose d 
bv ISheth fe Tormenl (fl999h or ISeliak fe Wairenl pOOl . 
fail in the high mass regime (|Cohn fe Whitdl2007ri ). At 
low masses, assembly bias appears largely driven by the 
formation of a non-accreting sub-population of low-mass 



halos in the vicinity of larger halos. Because the non- 
accreting halos become unbiased, they are more strongly 
clustered than accreting halos of similar mass, which are 
anti-biased. We found that a simple toy model is able 
to reproduce our numerical bias results in the asymp- 
totically large and small mass regimes. At intermediate 
masses (M ~ M*), both of the above effects become im- 
portant, and the assembly bias for a sample of halos will 
involve a weighted average of both effects. 

The application of these results to observed objects 
like galaxies or quasars is not straightforward, because 
it will depend upon the halo occupation distribution 
(HOD) of those objects. The very simplest HOD mod- 
els assume that the galaxy content depends only upon 
halo mass, and not upon assembly history, whereas semi- 
analytic models used in conjunction with N-body simu- 
lations tend to find a strong dependence of galaxy prop- 
erties on assembly histor y at fixed mass (|Springel et al.l 
120051: iCroton et all 120071 e.g.). If the former assump- 
tion is valid, then assembly bias will not be important 
for ga laxy clustering. Perhaps surprisingly. iTinker et all 
(2007) have recently found that properties of observed 
SDSS galaxies do not appear to correlate significantly 
with local environment, once the dependence on host 
halo mass has been accounted for, suggesting that as- 
sembly bias may indeed be unimportant for modeling 
galaxy clustering. 
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Besides the clustering of halos, our results also have 
implications for our understanding of halo assembly. In 
particular, we have found that massive halos are very 
well described by spherical collapse. The average pre- 
collapse Lagrangian volumes occupied by the most mas- 
sive halos are approximately spherical regions of size 
r ~ tth = {SM/Anp) 1 / 3 . However, tth is not a sharp 
boundary in Lagrangian space: there is significant mass 
outside r > r-pH at early times which is found inside the 
virial radius at late times, and similarly there is consid- 
erable mass at r < tth which avoids accretion at late 
times. Indeed, we might speculate that the post-collapse 
radial distribution of the matter inside and around a dark 
matter halo is simply related to the pre-collapse profile of 
local density and potential. Since both of these quanti- 
ties are described by well-understood Gaussian statistics, 
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